Abstract

One of the major challenges for collective intelligence is inconsistency,
which is unavoidable whenever subjective assessments are involved. Pairwise
comparisons allow one to represent such subjective assessments and
to process them by analyzing, quantifying and identifying the inconsistencies.

We propose using smaller scales for pairwise comparisons and provide
mathematical and practical justifications for this change. Our aim
is to initiate a paradigm shift in the search for a better scale construction
for pairwise comparisons. Beyond pairwise comparisons, the results
presented may be relevant to other methods using subjective scales.

Keywords: pairwise comparisons, collective intelligence, scale, subjective
assessment, inaccuracy, inconsistency.

1 Introduction


Collective intelligence (CI) practitioners face many challenges, as collaboration,
especially when it involves highly trained intellectuals, is not easy to manage. One
of the important aspects of collaboration is inconsistency arising from different
points of view on the same issue. According to [53], “Inconsistent knowledge
management (IKM) is a subject which is the common point of knowledge
management and conflict resolution. IKM deals with methods for reconciling
inconsistent content of knowledge. Inconsistency in the sense of logic has been
known for a long time. Inconsistency of this kind refers to a set of logical formulae
which have no common model.” and “The need for knowledge inconsistency
resolution arises in many practical applications of computer systems. This kind
of inconsistency results from the use of various sources of knowledge in realizing
practical tasks. These sources often are autonomous and they use different
mechanisms for processing knowledge about the same real world. This can lead
to inconsistency.”

Unfortunately, inconsistency is often taken as a synonym for inaccuracy, but
it is a “higher level” concept. Inconsistency indicates that inaccuracy of some
sort is present in the system. Certainly, inaccuracy by itself would not take
place if we were aware of it. We will illustrate this in a humorous way. When
a wrong phone call is placed, the caller usually apologizes with “I am sorry, I
have the wrong number” and may hear in reply: “If it is a wrong number, why
have you dialed it?” Of course, we would not have dialed the number if we had
known that it was wrong. In fact, the respondent is the one who detects the
incorrectness, not the caller.

However, self-correction may also take place in some other cases, for example,
via an analysis of our own assessments for inconsistency by comparing them
in pairs. Highly subjective stimuli are often present in the assessment of public
safety or public satisfaction. Similarly, decision making, as an outcome of mental
(cognitive) processes, is also based on mostly subjective assessments
for the selection of an action among several alternatives. We can compute the
inconsistency indicator of our assessments (subjective or not); it is rarely zero,
the value which stands for fully consistent assessments.

As the membership function of a fuzzy set is a generalization of the indicator
function in classical sets, the inconsistency indicator is related to the degree
of contradiction existing in the assessments. In fuzzy logic, the membership
function represents the degree of truth. Similarly, the inconsistency indicator
is related to both the degree of inaccuracy and contradiction. Degrees of truth
are often confused with probabilities, although they are conceptually distinct.
Fuzzy truth represents membership in vaguely defined sets but not the likelihood
of some event or condition. Likewise, the inconsistency indicator is not a
probability of contradictions but the degree of contradiction.

In our opinion, the pairwise comparisons method is one of the most feasible
representations of collective intelligence. It also allows one to measure it, for
example, by comparing CI with individual intelligence. (According to the online
Handbook of Collective Intelligence, hosted at the website of MIT Center of Collective
Intelligence, http://cci.mit.edu/research/index.html, measuring CI
is one of two main projects for developing theories of CI.) Pairwise comparisons
are easy to use, but may require complex computations to interpret them properly.
This is why we address the fundamental issue of scales of measure, which,
in particular, may have an effect on the feasibility of some computational schemes.


2 Pairwise comparisons preliminaries 

Comparing objects and concepts in pairs can be traced to the origin of science or
even earlier, perhaps to the Stone Age. It is not hard to imagine that our ancestors
must have compared “chicken and fish”, holding each of them in a separate
hand, for trading purposes. The use of pairwise comparisons is still considered
one of the most puzzling, intriguing, and controversial scientific methods, although
the first published use of pairwise comparisons (PC) is attributed to
Condorcet in 1785 (see [T^], four years before the French Revolution). Ramon
Llull, or Raimundus Lullus, designed an election method around 1275 in |5D].
His approach promoted the use of pairwise comparisons. However, neither Llull
nor Condorcet used a scale for pairwise comparisons.

Condorcet was the first to use a kind of binary version of pairwise comparisons
to reflect the preference in voting by the won-lost situation. In
l3Zj, a psychological continuum was defined by Thurstone in 1927, with the scale
values as the medians of the distributions of judgments on the psychological
continuum.

In [34], Saaty proposed a finite (nine-point) scale in 1977. In [26], Koczkodaj
proposed a smaller five-point scale with the distance-based inconsistency indicator.
This smaller scale better fits the heuristic “off by one grade or less” for
the acceptable level of inconsistency proposed in [^. We will show here that
a new convexity finding supports, for the first time, the use of an even smaller
scale.

Mathematically, an $n \times n$ real matrix $A = [a_{ij}]$ is a pairwise comparison (PC)
matrix if $a_{ij} > 0$ and $a_{ij} = 1/a_{ji}$ for all $i, j = 1, \dots, n$. Elements $a_{ij}$ represent
the result of (often subjectively) comparing the $i$th alternative (or stimulus) with
the $j$th alternative according to a given criterion. A PC matrix $A$ is consistent
if $a_{ij} a_{jk} = a_{ik}$ for all $i, j, k = 1, \dots, n$. It is easy to see that a PC matrix $A$
is consistent if and only if there exists a positive $n$-vector $w$ such that
$a_{ij} = w_i / w_j$, $i, j = 1, \dots, n$. For a consistent PC matrix $A$, the values $w_i$ serve as
priorities or implicit weights of the importance of alternatives.
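
The definitions above can be checked mechanically. The following is a minimal Python sketch, not part of the original study; the function names and the numerical tolerance are our own assumptions. It builds a consistent PC matrix from a weight vector and tests the reciprocity and consistency conditions.

```python
from itertools import product

def matrix_from_weights(w):
    """Build the consistent PC matrix a_ij = w_i / w_j from positive weights."""
    return [[wi / wj for wj in w] for wi in w]

def is_pc_matrix(a, tol=1e-9):
    """Check positivity and reciprocity: a_ij > 0 and a_ij = 1 / a_ji."""
    n = len(a)
    return all(
        a[i][j] > 0 and abs(a[i][j] * a[j][i] - 1.0) <= tol
        for i, j in product(range(n), repeat=2)
    )

def is_consistent(a, tol=1e-9):
    """Check a_ij * a_jk = a_ik for all index triples (up to the tolerance)."""
    n = len(a)
    return all(
        abs(a[i][j] * a[j][k] - a[i][k]) <= tol
        for i, j, k in product(range(n), repeat=3)
    )

# A consistent matrix generated by hypothetical weights (3, 2, 1) ...
consistent = matrix_from_weights([3.0, 2.0, 1.0])
# ... and a reciprocal but inconsistent matrix, since 3 * 3 != 5.
inconsistent = [[1.0, 3.0, 5.0], [1 / 3, 1.0, 3.0], [1 / 5, 1 / 3, 1.0]]
```

The tolerance is needed only because reciprocals such as 1/3 are not exactly representable in floating point.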


3 The pairwise comparisons scale problem 

Thurstone’s approach was extensively analyzed and elaborated on in the literature,
in particular by Luce and Edwards [55] in 1958. The bottom line is that
subjective quantitative assessments are not easy to provide. Not only is the
dependence between the stimuli and their assessments usually nonlinear, but
the exact nature of the nonlinearity is in general unclear. In this context, a
smaller scale is expected to generate a smaller error, for example by mitigating
the deviation from linearity.

On page 236 in [55], the authors wrote: “W.J. McGill is currently attempting
to find a better way of respecting individual differences while still obtaining a
‘universal scale’.” The authors of this study have not been able to trace any publication
of the late W.J. McGill on the “universal scale” construction. However,
the proposed smaller scale may be at least a temporary solution, as a
reflection of the “small is beautiful” movement inspired by Leopold Kohr and his
opposition to the “cult of bigness” in social organization. The smaller five-point
scale better fits the heuristic “off by one grade or less” for the acceptable level
of the distance-based inconsistency (as proposed in [55]). We will show here
that the new convexity finding, for the first time, supports the use of an even
smaller scale.

There are strong opponents of the pairwise comparisons method, some going as far
as opposing its use altogether. However, they forget
that every measurement, e.g., of length, is based on pairwise comparisons, since
we compare the measured object with some assumed unit. For example, one
meter is the basic unit of length in the International System of Units (SI). It
was once defined as the distance between two marks on a platinum-iridium
bar. Evidently, we are unable to eliminate pairwise comparisons from science,
hence we need to improve them. As we will demonstrate, this is an issue of scale
(in other words, of the input data) and, as such, it cannot be ignored.

4 In search of the nearest consistent pairwise 
comparisons matrix 

Several mathematical methods have been proposed for finding the nearest consistent
pairwise comparisons matrix for a given inconsistent pairwise comparisons
matrix. In [34], the eigenvector method was proposed, in which $w$ is the
principal eigenvector of $A$. Another class of approaches is based on optimization
methods and proposes different ways of minimizing (the size of) the difference
between $A$ and a consistent PC matrix. If the difference to be minimized is
measured in the least-squares sense, i.e. by the Frobenius norm, then we get
the Least Squares Method presented by Chu et al. [1^. The problem can be
written in the mathematical form (we present the normalized version, see [20]):

$$
\begin{aligned}
\min \quad & \sum_{i=1}^{n} \sum_{j=1}^{n} \left( a_{ij} - \frac{w_i}{w_j} \right)^2 \\
\text{s.t.} \quad & \sum_{i=1}^{n} w_i = 1, \\
& w_i > 0, \quad i = 1, \dots, n.
\end{aligned}
\tag{1}
$$
Since the $n \times n$ matrices can also be considered
as $n^2$-dimensional vectors (say, by stacking the columns over each other),
problem (1) determines a consistent PC matrix closest to $A$ in the sense of the
Euclidean norm. Unfortunately, problem (1) may be a difficult nonconvex optimization
problem with several possible local optima and with possible multiple
isolated global optimal solutions [Ml |M].

Some authors state that problem (1) has no special tractable form and is
difficult to solve [101 EH EH EH. In order to elude the difficulties caused by
the nonconvexity of (1), several other, more easily solvable problem forms have been
proposed to derive priority weights from an inconsistent pairwise comparison
matrix. The Weighted Least Squares Method [Mia in the form of

$$
\begin{aligned}
\min \quad & \sum_{i=1}^{n} \sum_{j=1}^{n} \left( a_{ij} w_j - w_i \right)^2 \\
\text{s.t.} \quad & \sum_{i=1}^{n} w_i = 1, \\
& w_i > 0, \quad i = 1, \dots, n
\end{aligned}
\tag{2}
$$

leads to a convex quadratic optimization problem whose unique optimal solution
is obtainable by solving a set of linear equations. The Logarithmic Least Squares
Method [M Ea in the form

$$
\begin{aligned}
\min \quad & \sum_{i=1}^{n} \sum_{i<j} \left( \log a_{ij} - \log \frac{w_i}{w_j} \right)^2 \\
\text{s.t.} \quad & \prod_{i=1}^{n} w_i = 1, \\
& w_i > 0, \quad i = 1, \dots, n
\end{aligned}
\tag{3}
$$

is (because the constraints are linearizable) a simple optimization problem
whose unique solution is given by the geometric means of the rows of matrix $A$. For
further approaches, see 0 EH Ea and the references therein. However, we
have to emphasize that the main purpose of many (if not most) optimization
approaches was to avoid the difficulties caused by the possible nonconvexity
of problem (1). This was usually done by sacrificing the natural approach of
Euclidean distance minimization.
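
As an illustration of the Logarithmic Least Squares Method, the row geometric means can be computed in a few lines. This is a minimal Python sketch under our own naming (the functions and the sample matrix are illustrative assumptions, not taken from the cited works); the weights are rescaled so that their product is 1, matching the normalization of problem (3).

```python
import math

def geometric_mean_weights(a):
    """Row geometric means of a PC matrix, rescaled so the weights multiply to 1."""
    n = len(a)
    g = [math.exp(sum(math.log(x) for x in row) / n) for row in a]
    # For an exactly reciprocal matrix the product of g is already 1;
    # the rescaling only guards against rounding or non-reciprocal input.
    scale = math.exp(sum(math.log(x) for x in g) / n)
    return [x / scale for x in g]

def log_least_squares_error(a, w):
    """Objective of problem (3): sum over i < j of (log a_ij - log(w_i / w_j))^2."""
    n = len(a)
    return sum(
        (math.log(a[i][j]) - math.log(w[i] / w[j])) ** 2
        for i in range(n) for j in range(i + 1, n)
    )

# Sample inconsistent matrix generated by the triad [3, 5, 3].
a = [[1.0, 3.0, 5.0], [1 / 3, 1.0, 3.0], [1 / 5, 1 / 3, 1.0]]
w = geometric_mean_weights(a)
```

Evaluating `log_least_squares_error(a, w)` against the same function at slightly perturbed weights gives a quick sanity check that the geometric means do minimize the logarithmic objective.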

As with many other real-life situations, it is impossible to decide which
solution is the best without a clear objective function. For example, a “Formula
One” car is not the best vehicle for a family with five children, but it may be
hard to win a Grand Prix race with a family van. In fact, pairwise comparisons
could be used for solving the dilemma of which approximation solution is the
best for PC (and for the family transportation problem).

The distance minimization approach (1) is so natural that one may wonder
why it was only recently revived in [H. The considerable computational complexity
(100 hours of CPU time for $n = 8$) and the possibility of having multiple
solutions (and/or multiple local minima) may be a reasonable explanation for
its not becoming popular in the past. Problem (1) has recently been solved in [T]
by reducing 100 hours of CPU time (or more likely, 150 days of CPU time) to
milliseconds. It was asserted in mulls] that the multiple solutions are far
enough from the ones that appear in real-life situations. However, it appears
that these assertions are mostly based on anecdotal evidence. More (numerical
and/or analytical) research to elucidate this point would be helpful.

As proved by Fülöp [20], the necessary condition for multiple solutions
to appear is that the elements of the matrix $A$ are large enough. In [20], using
the classic logarithmic transformation

$$ t_i = \log w_i, \quad i = 1, \dots, n, $$

and the univariate function

$$ f_a(t) = (e^{t} - a)^2 + (e^{-t} - 1/a)^2 \tag{4} $$

depending on the real parameter $a$, problem (1) can be transformed into the
equivalent form

$$
\begin{aligned}
\min \quad & \sum_{i=1}^{n-1} f_{a_{in}}(t_i) + \sum_{i=1}^{n-2} \sum_{j=i+1}^{n-1} f_{a_{ij}}(t_{ij}) \\
\text{s.t.} \quad & t_i - t_j - t_{ij} = 0, \quad i = 1, \dots, n-2, \; j = i+1, \dots, n-1.
\end{aligned}
\tag{5}
$$

It was also proved in [20] that there exists an $a_0 > 1$ such that for any $a > 0$ the
univariate function $f_a$ of (4) is strictly convex if and only if $1/a_0 \le a \le a_0$.
Consequently, when the condition $1/a_0 \le a_{ij} \le a_0$ is fulfilled for
all $i, j$, then (1) can be transformed into the convex programming problem (5)
with a strictly convex objective function to be minimized (see [20], Proposition
2). In other words, problem (1) and the equivalent problem (5) have a unique
solution which can be found using standard local search methods. The above-mentioned
constant equals $a_0 = ((123 + 55\sqrt{5})/2)^{1/4} = \sqrt{\tfrac{1}{2}(11 + 5\sqrt{5})} \approx 3.330191$,
which is a reasonable bound for many real-life problems. The above
$a_0$ is not necessarily a sharp threshold since its proof is based on the convexity of
univariate functions (see [20], Proposition 2, or see the Appendix of the present
paper for a compact low-tech argument) and it is conceivable that the exact
threshold for the sum of univariate functions is greater than $a_0$. We know,
however, that this threshold must be less than $a_1 \approx 3.6$ since, as shown by
Bozóki [4], for any $\lambda > a_1$ it is easy to construct a $3 \times 3$ PC matrix with $\lambda$ as the
largest element and with multiple local minima. Finally, even if some elements
of a PC matrix are relatively large, it may still happen that (1) has a single
local minimum; a sample sufficient condition is given in Corollary 2 of [20].
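
The bound is easy to evaluate and to test against a given matrix. The Python sketch below is only illustrative: the constant name `A0`, the helper `within_convexity_bound`, and the two sample matrices are our own assumptions. It computes the constant from both closed forms quoted above and checks whether every entry of a PC matrix lies in the interval on which uniqueness is guaranteed.

```python
import math

# Convexity constant from the text, approximately 3.330191.
A0 = math.sqrt((11 + 5 * math.sqrt(5)) / 2)
# Equivalent fourth-root form quoted in the text.
A0_ALT = ((123 + 55 * math.sqrt(5)) / 2) ** 0.25

def within_convexity_bound(a, bound=A0):
    """True if every entry of the PC matrix satisfies 1/bound <= a_ij <= bound."""
    return all(1 / bound <= x <= bound for row in a for x in row)

# Entries taken from a 1 to 3 scale stay inside the bound ...
small_scale = [[1.0, 2.0, 3.0], [1 / 2, 1.0, 2.0], [1 / 3, 1 / 2, 1.0]]
# ... while a 1 to 5 scale can leave it.
big_scale = [[1.0, 3.0, 5.0], [1 / 3, 1.0, 3.0], [1 / 5, 1 / 3, 1.0]]
```

A `False` result does not mean that multiple minima exist; it only means that the sufficient condition for uniqueness is not met.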

A nonlinear programming solver (available in Excel and described in [T]) is
good enough if (1) has a single local minimum for a given PC matrix $A$. Our
incentive for postulating a restricted ratio scale for pairwise comparisons comes
both from the guaranteed uniqueness in the interval determined in [20] and from
the demonstrably possible (by [1]) non-uniqueness outside of a just slightly larger
interval.
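
To make the remark about local search concrete, here is a small Python sketch of plain gradient descent on the log-variable form of the least squares problem (the form (6) of the Appendix). It is not the solver referred to above: the step size, the iteration count, and the function names are assumptions chosen for a 3 x 3 illustration, and the gradient formula relies on the input matrix being reciprocal.

```python
import math

def f_prime(a, x):
    """Derivative in x of f_a(x) = (e^x - a)^2 + (e^-x - 1/a)^2."""
    ex, emx = math.exp(x), math.exp(-x)
    return 2 * (ex - a) * ex - 2 * (emx - 1 / a) * emx

def least_squares_weights(a, step=0.01, iters=5000):
    """Gradient descent on sum_{i<j} f_{a_ij}(t_i - t_j), where t_i = log w_i.

    Assumes a is reciprocal, so the partial derivative with respect to t_i
    is the sum over j != i of f'_{a_ij}(t_i - t_j).
    """
    n = len(a)
    t = [0.0] * n  # start from equal weights; the sum of t stays (nearly) 0
    for _ in range(iters):
        grad = [
            sum(f_prime(a[i][j], t[i] - t[j]) for j in range(n) if j != i)
            for i in range(n)
        ]
        t = [ti - step * gi for ti, gi in zip(t, grad)]
    return [math.exp(ti) for ti in t]

# Consistent sample matrix with entries inside the 1 to 3 scale.
a_demo = [[1.0, 2.0, 3.0], [0.5, 1.0, 1.5], [1 / 3, 1 / 1.5, 1.0]]
w_demo = least_squares_weights(a_demo)
```

For matrices whose entries stay within the convexity interval, any such descent method heads toward the single minimum; a fixed step size is used here only for brevity and would need tuning for larger matrices.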

Several inconsistency indicators have been proposed. The distance-based
inconsistency (introduced in [26]) is the maximum over all triads $\{a_{ik}, a_{kj}, a_{ij}\}$
of elements of $A$ (with all indices $i, j, k$ distinct) of their inconsistency indicators
defined as:

$$
\min\left( \left| 1 - \frac{a_{ij}}{a_{ik} a_{kj}} \right|, \; \left| 1 - \frac{a_{ik} a_{kj}}{a_{ij}} \right| \right).
$$
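
The indicator translates directly into code. The following Python sketch uses our own function names and assumes a reciprocal matrix with at least three rows; under reciprocity every reordering of a triad yields the same value, so it is enough to scan index triples i < k < j.

```python
from itertools import combinations

def triad_inconsistency(a_ik, a_ij, a_kj):
    """Distance-based indicator of one triad; it is 0 when a_ij == a_ik * a_kj."""
    product_ = a_ik * a_kj
    return min(abs(1 - a_ij / product_), abs(1 - product_ / a_ij))

def matrix_inconsistency(a):
    """Maximum triad indicator over all index triples i < k < j (needs n >= 3)."""
    n = len(a)
    return max(
        triad_inconsistency(a[i][k], a[i][j], a[k][j])
        for i, k, j in combinations(range(n), 3)
    )

# Sample matrix generated by the triad [3, 5, 3].
a = [[1.0, 3.0, 5.0], [1 / 3, 1.0, 3.0], [1 / 5, 1 / 3, 1.0]]
ii = matrix_inconsistency(a)
```

Because the maximum is attained at a specific triad, the same loop can also report which three judgments are the most inconsistent, which is what makes localizing the inconsistency possible.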


Convergence of this inconsistency was finally proved in m (an earlier attempt
in [22] had a hole in the proof of Theorem 1). A modification of the
distance-based inconsistency was proposed in 2002 in [TBj. An analysis of the
eigenvalue-based and distance-based inconsistencies was well presented in [5].
Paying no attention to what we really process to get the best approximation
leads to what GIGO, the informal rule of “garbage in, garbage out”, so nicely
illustrates. This is why localizing the inconsistency and reducing it is so important.


5 The scale size problem 

As of today, the scale size problem for the PC method has not been properly 
addressed. We postulate the use of a smaller rather than larger scale and more 
research to validate it. 

As mentioned earlier, an interesting property of PC matrices has recently been
found in [20]. Namely, (1) has a unique local (thus global) optimal solution
and it can be easily obtained by local search techniques if $1/a_0 \le a_{ij} \le a_0$
holds for all $i, j = 1, \dots, n$, where the value $a_0$ is at least 3.330191 (but cannot
be larger than $a_1 \approx 3.6$, see ID). In our opinion, this finding has a fundamental
importance for the construction of any scale and we postulate that the scale 1
to 3 (1/3 to 1 for inverses) should be carefully looked at before a larger scale
is considered. In the light of the property from [20], finding the solution of (1)
would be easier and faster. This fact should shift the research of pairwise comparisons
back toward (1) for approximations of inconsistent PC matrices. This
is a starting point for the distance minimization approaches. It is worth noting
that the PC method is for processing subjectivity expressed by quantitative data.
For purely quantitative data (reflecting objectively measurable even if possibly
uncertain quantities), there are usually more precise methods (e.g., equations,
systems of linear equations, PDFs, just to name a few of them). In general, we
are better prepared for processing quantitative data (e.g., real numbers) than
for qualitative data.

A comparative scale is an ordinal or rank order scale that can also be referred
to as a non-metric scale. Respondents evaluate two or more objects at a time
and objects are directly compared with one another as part of the measuring
process. In practice, using a moderate scale for expressing preferences makes
perfect sense. When we ask someone to express his/her preference on the 0 to
100 scale, the natural tendency is to use numbers rounded to tens (e.g., 20, 40,
70, ...) rather than finer numbers. In fact, there are situations, such as
being pregnant or not, with practically nothing in between. The theory of scale
types was proposed by Stevens in [35]. He claimed that any measurement in science
was conducted using four different types of scales that he called “nominal”,
“ordinal”, “interval”, and “ratio”.

Measurement is defined as “the correlation of numbers with entities that are
not numbers” by the representational theory in [32]. In the additive conjoint
measurement (independently discovered by the economist Debreu in [15] and
by the mathematical psychologist Luce and statistician Tukey in HH]), numbers
are assigned based on correspondences or similarities between the structure
of number systems and the structure of qualitative systems. A property is
quantitative if such structural similarities can be established. It is a stronger
form of representational theory than that of Stevens, where numbers need only be
assigned according to a rule. Information theory recognizes that all data are
inexact and statistical in nature. Hubbard in [23] characterizes measurement
as: “A set of observations that reduce uncertainty where the result is expressed
as a quantity.”

In practice, we begin a measurement with an initial guess as to the value 
of a quantity, and then, by using various methods and instruments, try to 
reduce the uncertainty in the value. The information theory view, unlike the 
positivist representational theory, considers all measurements to be uncertain. 
Instead of assigning one value, a range of values is assigned to a measurement. 
This approach also implies that there is a continuum between estimation and 
measurement. 

The Rasch model for measurement seems to be relevant to PC with the
decreased scale. It uses a logistic function (or logistic curve, the most common
sigmoid curve): $P(t) = \frac{1}{1 + e^{-t}}$. Coincidentally, the exponential function was
used in [20] for the estimation of the upper bound of $a_{ij}$.

We note that the phenomenon of scale reduction appears implicitly
in the Logarithmic Least Squares Method uniis] as well. It is easy to see that
in problem (3), it is not the original PC matrix $A$ which is approximated but
$\log A$, which consists of the entries $\log a_{ij}$.

6 An example of a problem related to using two 
scales 

Let us look at two scales: 1 to 5 and 1 to 3:

Bigger scale    1   2     3   4     5
Smaller scale   1   1.5   2   2.5   3


The inconsistent pairwise comparisons table for the 1 to 5 scale generated by
the triad [3, 5, 3] is:

1     3     5
1/3   1     3
1/5   1/3   1

The inconsistency of this table is computed as
$\min\left( \left| 1 - \frac{5}{3 \cdot 3} \right|, \left| 1 - \frac{3 \cdot 3}{5} \right| \right) = 4/9$.

The triad [3, 5, 3] consists of the top scale value in the middle and the middle
scale value as the first and last values of the triad. Similarly, the inconsistent
pairwise comparisons table for the 1 to 3 scale generated by the triad [2, 3, 2] is:

1     2     3
1/2   1     2
1/3   1/2   1

The inconsistency of this table is computed as
$\min\left( \left| 1 - \frac{3}{2 \cdot 2} \right|, \left| 1 - \frac{2 \cdot 2}{3} \right| \right) = 0.25$.

The middle value in the triad [2, 3, 2] is the upper bound of the scale 1 to 3.
The other two values (2) are equal to the middle point of the scale 1 to
3. The same goes for the values of the triad [3, 5, 3] on the scale 1 to 5, hence we
can see that the two triads correspond to each other. Yet their inconsistencies are
drastically different: under the heuristic threshold of 1/3 assumed in [55], the
first table (4/9) is clearly unacceptable while the second table (0.25) is acceptable.
Needless to say, there is no canonical mapping from the scale 1 to 5 to the scale
1 to 3. The table proposed above is admittedly ad hoc and we present it for
demonstration purposes only.
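
The effect of the rescaling can be reproduced with a short Python sketch. The linear map v -> 1 + (v - 1)/2 is simply the ad hoc correspondence of the two scales listed above, and the function names and the threshold constant are our own assumptions for illustration.

```python
def to_smaller_scale(v):
    """Map a judgment from the 1 to 5 scale to the 1 to 3 scale.

    Values v >= 1 are mapped linearly by v -> 1 + (v - 1) / 2;
    reciprocal judgments (v < 1) are mapped to the reciprocal of the mapped inverse.
    """
    if v >= 1:
        return 1 + (v - 1) / 2
    return 1 / to_smaller_scale(1 / v)

def triad_inconsistency(x, y, z):
    """Distance-based indicator of the triad [x, y, z]; it is 0 when y == x * z."""
    return min(abs(1 - y / (x * z)), abs(1 - x * z / y))

THRESHOLD = 1 / 3  # heuristic acceptance level used in the text

big_triad = (3, 5, 3)
small_triad = tuple(to_smaller_scale(v) for v in big_triad)

big_ii = triad_inconsistency(*big_triad)      # above the threshold
small_ii = triad_inconsistency(*small_triad)  # below the threshold
```

The same judgments, merely re-expressed on the smaller scale, move from the unacceptable to the acceptable side of the threshold, which is the point of the example.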

Evidently, more research is needed for this not so recent problem. In all
likelihood, it was mentioned for the first time in [55] in 1958. Most real-life
projects using the pairwise comparisons method are impossible to replicate or
recompute for the new scale, as the costs of such an exercise would be substantial.
It may take some time before a project with a double scale is launched and
completed.

7 The power of the number three 

The “use of three” for a comparison scale has a reflection in real life. Probably
the greatest support for the use of three as the upper limit for a scale comes
from grammar. Our spoken and written language has evolved over thousands
of years and grammar is at the core of each modern language. In his 1946
textbook [5] (which also nicely describes the degrees of comparison as they may
be used in PC), Bullions defines comparison of adjectives as:

Adjectives denoting qualities or properties capable of increase, 
and so of existing in different degrees, assume different forms to 
express a greater or less degree of such quality or property in one 
object compared with another, or with several others. These forms 
are three, and are appropriately denominated the positive, comparative,
and superlative. Some object to the positive being called a
degree of comparison, because in its ordinary use it does not, like
the comparative and superlative forms, necessarily involve comparison.
And they think it more philosophical to say, that the degrees
of comparison are only two, the comparative and superlative. This, 
however, with the appearance of greater exactness is little else than 
a change of words, and a change perhaps not for the better. If we 
define a degree of comparison as a form of the adjective which necessarily
implies comparison, this change would be just, but this is
not what grammarians mean, when they say there are three degrees 
of comparison. Their meaning is that there are three forms of the 
adjective, each of which, when comparison is intended, expresses a 
different degree of the quality or attribute in the things compared: 
Thus, if we compare wood, stone, and iron, with regard to their 
weight, we would say “wood is heavy, stone heavier, and iron is the 
heaviest.” 

Each of these forms of the adjective in this comparison expresses 
a different degree of weight in the things compared, the positive 
heavy expresses one degree, the comparative heavier, another, and 
the superlative heaviest, a third, and of these the first is as essential 
an element in the comparison as the second, or the third. Indeed 
there never can be comparison without the statement of at least two 
degrees, and of these the positive form of the adjective either expressed
or implied, always expresses one. When we say “wisdom is
more precious than rubies,” two degrees of value are compared, the
one expressed by the comparative, “more precious,” the other necessarily
implied. The meaning is “rubies are precious, wisdom is more
precious.” Though, therefore, it is true, that the simple form of the 
adjective does not always, nor even commonly denote comparison, 
yet as it always does indicate one of the degrees compared whenever 
comparison exists, it seems proper to rank it with the other forms, as 
a degree of comparison. This involves no impropriety, it produces no 
confusion, it leads to no error, it has a positive foundation in the nature
of comparison, and it furnishes an appropriate and convenient
appellation for this form of the adjective, by which to distinguish it 
in speech from the other forms. 


8 Conclusions and final remarks 

Expressing subjective assessments with high accuracy is practically impossible,
therefore a small comparison scale is appropriate. For example, expressing our
pain on the scale of 1 to 100, or even 1 to 10, seems more difficult, and arguably
less meaningful, than on the scale of 1 to 3. In the past, the scale 1 to 9 was
proposed in [34] and 1 to 5 in [55]. In this study, we have demonstrated that the
use of the smaller 1 to 3 scale, rather than larger ones, has good mathematical
foundations.

More research needs to be conducted along the measurement theory lines of
[36], but with emphasis on PC. In our opinion, playing endlessly with numbers
and symbols to find precise solutions for inherently ill-defined problems should
be replaced by more research towards utilization of choice theory in pairwise
comparisons. The presented strong mathematical evidence supports the use of a
more restricted scale. We would like to encourage other researchers to conduct
Monte Carlo simulations with the proposed scale and to compare the results
with those yielded by other approaches. In particular, it would be useful to
investigate more closely the relationship between the degree of inconsistency of
a PC matrix, the size of the scale and the possible existence of multiple local or
global optima for the Least Squares Method (cf. [il [Ml).

The use of large scales (e.g., 1 to 10 in medicine for the pain level specification
routinely requested in all Canadian hospitals upon admitting an emergency patient,
if he/she is still capable of talking) is a prime example of how important this
problem may be for the improvement of daily life. Making inferences on the basis
of meaningless numbers might have pushed other patients further back in the usually long
emergency lineups.

Although the theoretical basis for suggesting the scale 1 to 3 hinges on the
value of the constant $a_0 = \sqrt{\tfrac{1}{2}(11 + 5\sqrt{5})} \approx 3.330191$, the importance of which
was established in [20] in the context of pairwise comparisons, its applicability
to the universal subjective scale is a vital possibility worth further scientific
examination.
Appendix

After a change of variables to $t_j = \log w_j$, $j = 1, \dots, n$, and a change in normalization
to $\prod_{j=1}^{n} w_j = 1$, the problem (1) can be rewritten as

$$
\min \Big\{ \sum_{i<j} \left[ (e^{t_i - t_j} - a_{ij})^2 + (e^{t_j - t_i} - 1/a_{ij})^2 \right] : \; t_1, \dots, t_n \in \mathbb{R}, \; \sum_{i=1}^{n} t_i = 0 \Big\} \tag{6}
$$

Our goal is to provide a streamlined version of the argument from [20] for
showing that if the $a_{ij}$'s are not “too large”, then this minimization problem has a
unique solution.

The existence part is easy: if the norm of $t = (t_1, \dots, t_n)$ tends to $\infty$, then,
because of the constraint $\sum_{i=1}^{n} t_i = 0$, we must have both $\max_j t_j \to +\infty$
and $\min_j t_j \to -\infty$, hence $t_i - t_j \to \infty$ for some $i, j$, which forces the objective
function to go to $\infty$. This allows us to reduce the problem to a compact subset
of $\mathbb{R}^n$, where existence of a minimum follows from continuity of the objective
function.

The uniqueness will follow if we show that the objective function in (6),
which we denote by $\Phi = \Phi(t_1, \dots, t_n)$, is globally convex, and strictly convex when
restricted to the hyperplane given by the constraint.

For $a > 0$ and $x \in \mathbb{R}$, we set $f_a(x) := (e^{x} - a)^2 + (e^{-x} - 1/a)^2$; then
$\Phi = \sum_{i<j} f_{a_{ij}}(t_i - t_j)$. Our next goal is to show that if
$a_0 := \sqrt{\tfrac{1}{2}(11 + 5\sqrt{5})} \approx 3.33019$ and $a \in [1/a_0, a_0]$, then $f_a$ is convex.
Since a composition of a linear
function with a convex function (in that order) is convex, it follows that if
$\max_{ij} a_{ij} \le a_0$, then each term $f_{a_{ij}}(t_i - t_j)$ is convex, and so is $\Phi$, the entire
sum.

To that end, we calculate the second derivative of $f_a$ and obtain

$$ f_a''(x) = -2 \left( a^2 e^{x} - 2a \left( e^{-2x} + e^{2x} \right) + e^{-x} \right) / a. $$

Roughly, $f_a$ will be convex whenever the expression in the outer parentheses
is negative (note that $a > 0$ by hypothesis). Given that the expression is a
quadratic function in $a$, this will happen when $a$ is between the roots of this
function, which are easily calculated to be $\varphi(w) = (1 + w^4 - \sqrt{1 + w^4 + w^8})/w^3$
and $\psi(w) = (1 + w^4 + \sqrt{1 + w^4 + w^8})/w^3$, where $w = e^{x} > 0$. The graphs of
the functions $a = \varphi(w)$ and $a = \psi(w)$ can be easily rendered (see Fig. 1). In
particular, it is apparent that there is a nontrivial range of values of $a$ for which
$\varphi(w) < a < \psi(w)$ for all $w > 0$, which implies that the corresponding $f_a$'s are
strictly convex on their entire domain $-\infty < x < \infty$. In view of the symmetries
of the problem, that range must be of the form $1/a_0 < a < a_0$, and it is clear
from the picture that $a_0 > 3$. For the extreme values $a = a_0$ and $a = 1/a_0$, the
second derivative of $f_a$ will be strictly positive except at one point, which still
implies strict convexity of $f_a$.

Figure 1: The graphs of $a = \varphi(w)$, $a = \psi(w)$ and $a = 3$. The shaded region
$\varphi(w) < a < \psi(w)$ corresponds to “regions of convexity” of the functions $f_a$.
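
The closed form of the second derivative and the resulting convexity claim can be sanity-checked numerically. The Python sketch below is only a sketch under stated assumptions: the function names, the finite-difference step, and the sample points are ours. It compares the formula with a central finite difference and looks at the sign of the second derivative for a = 3 (inside the convexity interval) and a = 4 (outside it).

```python
import math

def f(a, x):
    """f_a(x) = (e^x - a)^2 + (e^-x - 1/a)^2."""
    return (math.exp(x) - a) ** 2 + (math.exp(-x) - 1 / a) ** 2

def f_second(a, x):
    """Closed form: -2 (a^2 e^x - 2a (e^-2x + e^2x) + e^-x) / a."""
    inner = (a * a * math.exp(x)
             - 2 * a * (math.exp(-2 * x) + math.exp(2 * x))
             + math.exp(-x))
    return -2 * inner / a

def f_second_numeric(a, x, h=1e-4):
    """Central finite-difference approximation of the second derivative."""
    return (f(a, x + h) - 2 * f(a, x) + f(a, x - h)) / (h * h)

# Largest gap between the closed form and the finite difference at sample points.
max_gap = max(
    abs(f_second(a, x) - f_second_numeric(a, x))
    for a in (0.5, 2.0, 3.0)
    for x in (-1.0, 0.0, 0.7)
)

# Sign of the second derivative on a grid of x values from -3 to 3.
convex_for_3 = all(f_second(3.0, k / 100) > 0 for k in range(-300, 301))
negative_for_4_at_0 = f_second(4.0, 0.0) < 0
```

A grid check of this kind is of course no proof; it only confirms that the algebra behaves as described on the sampled points.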
It is not too difficult to obtain more precise results, both numerically and
analytically. For the latter, we check directly (or deduce from symmetries of $f_a$
or $f_a''$) that $\varphi(1/w) = 1/\psi(w)$; this confirms that $a_0 := \inf \psi(w) = 1/\sup \varphi(w)$,
and so it is enough to determine $a_0$. To apply the first derivative test to $\psi$, we
calculate

$$
\psi'(w) = \left( \frac{-3 - w^4 + w^8}{\sqrt{1 + w^4 + w^8}} + w^4 - 3 \right) \Big/ w^4.
$$

While this looks slightly intimidating, it is not hard to check that the only
positive zero of $\psi'$ is $w_0 = \sqrt{\tfrac{1}{2} + \tfrac{\sqrt{5}}{2}} \approx 1.27202$, which also shows rigorously
that $\psi$ decreases on $(0, w_0)$ and increases on $(w_0, \infty)$ (both strictly). Consequently,
$a_0 = \psi(w_0) = \sqrt{\tfrac{1}{2}(11 + 5\sqrt{5})}$, as asserted. All these calculations can
be done by hand, or, much faster, using a computer algebra system such as
Mathematica, Maple, or Maxima.
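
For readers without a computer algebra system at hand, the values can also be confirmed numerically. This is a minimal Python sketch with our own function names; the grid range and spacing are arbitrary assumptions. It evaluates the two root functions, the claimed minimizer, and a coarse grid minimum of the upper root function.

```python
import math

def psi(w):
    """Larger root of the quadratic in a, as a function of w = e^x."""
    return (1 + w**4 + math.sqrt(1 + w**4 + w**8)) / w**3

def phi(w):
    """Smaller root of the quadratic in a, as a function of w = e^x."""
    return (1 + w**4 - math.sqrt(1 + w**4 + w**8)) / w**3

w0 = math.sqrt((1 + math.sqrt(5)) / 2)       # claimed minimizer of psi
a0 = math.sqrt((11 + 5 * math.sqrt(5)) / 2)  # claimed minimum value of psi

# Coarse grid search over w in [0.1, 5] with spacing 0.001.
grid_min = min(psi(k / 1000) for k in range(100, 5001))
```

Comparing `psi(w0)` with `a0` and with `grid_min`, and `phi(1 / w0)` with `1 / a0`, reproduces the closed-form values and the symmetry between the two roots to within rounding and grid resolution.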

The above argument proves global convexity of $\Phi$; it remains to show strict
convexity on the hyperplane $H = \{ t = (t_i)_{i=1}^{n} : \sum_{i=1}^{n} t_i = 0 \}$, which is
equivalent to strict convexity of the restriction to any line contained in $H$.
Given such a line $\lambda \mapsto t + \lambda u$ (with $t, u \in H$, $u \ne 0$ and $\lambda \in \mathbb{R}$), consider any
pair of coordinates $i, j$ such that $u_i - u_j \ne 0$ and the corresponding term in the
sum defining $\Phi$, namely $f_{a_{ij}}((t_i - t_j) + \lambda(u_i - u_j)) =: \phi(\lambda)$. Clearly
$\phi''(\lambda) = (u_i - u_j)^2 f_{a_{ij}}''((t_i - t_j) + \lambda(u_i - u_j)) \ge 0$, and it can vanish for at most one
value of $\lambda$ (and only if $a_{ij} = a_0$ or $a_{ij} = 1/a_0$). Thus $\phi$ is strictly convex,
and since all the other terms appearing in $\Phi$ are convex, it follows that the
restriction of $\Phi$ to the line, and hence to $H$, is strictly convex. It is also clear
that if $\max_{ij} a_{ij} < a_0$, the above argument yields a non-trivial lower bound on
the positive-definiteness of the Hessian of the restriction of $\Phi$ to $H$ (this issue
has been elaborated upon in [20]), which in particular has consequences for the
speed of convergence of algorithms solving